ABSTRACT

We measure the covariance of the non-linear matter power spectrum from N-body
simulations using two methods. In the first case, the covariance of power is estimated
from the scatter over many random realizations of the density field. In the second, we 
use a novel technique to measure the covariance matrix from each simulation individ- 
ually by re-weighting the density field with a carefully chosen set of functions. The 
two methods agree at linear scales, but unexpectedly they disagree substantially at 
increasingly non-linear scales. Moreover, the covariance of non-linear power measured 
using the re-weightings method changes with box size. The numerical results are con- 
sistent with an explanation given in a companion paper, which argues that the cause 
of the discrepancy is beat-coupling, in which products of Fourier modes separated by
a small wavevector couple by gravitational growth to the large-scale beat mode be- 
tween them. We calculate the information content of the non-linear power spectrum 
(about the amplitude of the initial, linear power spectrum) using both methods and 
confirm the result of a previous paper, that at translinear scales the power spectrum
contains little information over and above that in the linear power spectrum, but that 
there is a marked increase in information at non-linear scales. We suggest that, in real 
galaxy surveys, the covariance of power at non-linear scales is likely to be dominated
by beat-coupling to the largest scales of the survey and that, as a result, only part of 
the information potentially available at non-linear scales is actually measurable from 
real galaxy surveys. 

Key words: cosmology: theory - large-scale structure of Universe. 



1 INTRODUCTION 

Recent progress in cosmological parameter estimation has been
characterized by rapid convergence of constraints from a wealth of
different types of observations - galaxy clustering, the Lyman-α forest
power spectrum, galaxy cluster abundances, high-redshift type Ia
supernovae, weak gravitational lensing, big-bang nucleosynthesis and many
others - toward a single, well-determined 'concordance' cosmological
model. Of particular importance has been the combination of high
precision maps of anisotropies in the cosmic microwave background (CMB)
with measurements of galaxy clustering from large redshift surveys,
which yield comple- [...] are generally restricted to scales > 20 Mpc,
where density fluctuations are still linear, galaxy-matter bias appears
to be independent of scale (although it does depend on lu- 
minosity and on galaxy type) and the matter power spec- 
trum directly traces the spectrum of density fluctuations 
at recombination. At smaller scales, where much of the ob- 
servational data in galaxy surveys lie, the extent to which 
the linear power spectrum can be recovered from the non- 
linear power spectrum remains unknown. Non-linear evolu- 
tion changes the shape of the power spectrum in a non-trivial 
way, and introduces broad correlations between measurements of power at
different wavenumbers (Meiksin & White 1999; Scoccimarro, Zaldarriaga &
Hui 1999; Cooray & Hu 2001). The early success of analytic formalisms at
producing invertible one-to-one mappings between linear and non-linear
spectra (Hamilton et al. 1991; Peacock & Dodds 1994, 1996) suggested
that information in the linear power spectrum may be preserved into the
non-linear regime, but other work (Meiksin, White & Peacock 1999; Seo &
Eisenstein 2003) has shown that non-linear evolution erases features in
the power spectrum, perhaps leading to an irreversible loss of
information.

In an earlier letter (Rimes & Hamilton 2005, hereafter Paper I), we
reported measurements of the amount of information in the non-linear
power spectrum about the amplitude of the linear power spectrum for the
currently favoured (concordance) cosmological model, from a large
ensemble of N-body simulations. We have since discovered a small error
in our calculations. In Fig. Q of the present paper we present a revised
version of the results from Paper I. Our conclusions remain unchanged:
namely, that there exists little independent information in the
translinear regime ($k \sim 0.2$-$0.8\,h\,\mathrm{Mpc}^{-1}$ at the
present day) over and above that in the linear power spectrum, but that
in the fully non-linear regime there appears to be a significant amount
of information beyond that measurable from the linear power spectrum.

Measuring the information content of the non-linear power spectrum
involves measuring the covariance matrix of non-linear power, which, if
determined as in Paper I from the scatter over an ensemble, requires
performing a large number of N-body simulations. This is costly in terms
of computing time, especially if the simulations are of high quality. In
an attempt to reduce the computational overhead and to streamline the
measurement of information, we devised a new technique, described in
detail in a companion paper (Hamilton, Rimes & Scoccimarro 2005,
hereafter HRS), for estimating the covariance of power from individual
simulations, by re-weighting the density field using a set of carefully
chosen windows.

In the present paper, we use the re-weighting technique 
to measure the amount of information in the non-linear 
power spectrum, for the same set of simulations used in Pa- 
per I. We compare the results to those obtained with the en- 
semble method and present a number of tests of the method. 

Unexpectedly, we find that, far from agreeing, the co- 
variance of power measured by the re-weightings method 
substantially exceeds that measured by the ensemble 
method. When we first encountered this discrepancy, it 
seemed to us that it must be caused by a 'bug' in our 
code, and we performed numerous numerical tests to track 
it down. Belatedly, we realized that the discrepancy was 
caused not by a bug but by a real physical process, which we 
term 'beat-coupling'. The physical origin of beat-coupling is 
described by HRS, who illustrate its effects with examples 
using perturbation theory and the hierarchical model. 

This paper is organized as follows. In Section 2 we set out our
definition of information, and discuss the decorrelation choices that
must be made to allocate information to prescribed wavebands. This
section also provides a more detailed exposition of the techniques
employed in Paper I. Section 3 describes the numerical simulations used
in both this paper and the previous one. Section 4 compares measurements
of the covariance of power using both ensemble and re-weightings
methods, and describes several tests of the results. In Section 5 we
compare the information content of the non-linear power spectrum
measured using the two different methods. Our conclusions are summarized
in Section 6.



2 INFORMATION 

2.1 Fisher information 

The Fisher information matrix is defined (e.g. Tegmark, Taylor &
Heavens 1997) as

\[ F_{\alpha\beta} \equiv -\left\langle \frac{\partial^2 \ln\mathcal{L}}{\partial p_\alpha \, \partial p_\beta} \right\rangle , \tag{1} \]

where $\mathcal{L}(p_\alpha|\mathrm{data})$ is the likelihood function -
the multivariate probability distribution of the model parameters
$p_\alpha$ given the available data and a set of model assumptions (the
Bayesian prior). Fisher information is additive over independent
measurements, clearly a desirable property for information to possess.
Its importance in parameter estimation is encapsulated in the
Cramér-Rao inequality, which limits the maximum precision with which a
single parameter $p_\alpha$ can be measured to

\[ \langle \Delta\hat{p}_\alpha^2 \rangle \geq 1/F_{\alpha\alpha} , \tag{2} \]

if the estimator $\hat{p}_\alpha$ is unbiased and if this is the only
parameter being estimated from the data. Here and throughout this paper
we use hats to distinguish an estimate of a quantity from its true
value. If estimates of the parameters are Gaussian distributed about
their expectation values - a good approximation in the limit of a large
amount of data, thanks to the central limit theorem - then their
covariance matrix is well-approximated by the inverse of the Fisher
matrix:

\[ \langle \Delta\hat{p}_\alpha \, \Delta\hat{p}_\beta \rangle \approx (F^{-1})_{\alpha\beta} . \tag{3} \]

2.2 Power spectrum 

We consider a statistically homogeneous and isotropic density field
$\rho(\mathbf{r})$. The power spectrum $P(k)$ of density fluctuations of
such a field is defined by

\[ \langle \delta_{\mathbf{k}} \delta_{\mathbf{k}'} \rangle = (2\pi)^3 \delta_D(\mathbf{k} + \mathbf{k}') P(k) , \tag{4} \]

where $\delta_{\mathbf{k}}$ is the Fourier transform of the overdensity
$\delta(\mathbf{r}) = \rho(\mathbf{r})/\bar\rho - 1$ and
$\delta_D(\mathbf{k})$ is a 3-dimensional Dirac delta function.
Statistical isotropy requires that the power spectrum be a function only
of the magnitude $k = |\mathbf{k}|$ of the wavevector $\mathbf{k}$.

For Gaussian fluctuations, each $\delta_{\mathbf{k}}$ has real and
imaginary components that are independently Gaussianly distributed with
variance $P(k)/2$. Usually, power is estimated by averaging over shells
in Fourier space. For Gaussian fluctuations, the expected covariance
matrix of shell-averaged estimates of power is diagonal, with variance

\[ \langle \Delta\hat{P}(k)^2 \rangle = 2P(k)^2/N_k , \tag{5} \]

where $N_k$ is the number of modes in the shell around $k$ (a finite
number in the case of a realistic galaxy survey or a periodic N-body
simulation). Here $\delta_{\mathbf{k}}$ and its complex conjugate
$\delta_{-\mathbf{k}}$ are counted as contributing two distinct modes,
the real and imaginary parts of $\delta_{\mathbf{k}}$.
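
The Gaussian variance in equation (5) can be checked numerically. The following is a minimal sketch, not the code used for the simulations in this paper: it assumes a single shell with constant power, draws the real and imaginary parts of each mode independently with variance $P/2$, and compares the scatter of the shell-averaged power with $2P^2/N_k$. The function name and its parameters are hypothetical.

```python
import numpy as np

def shell_power_variance(P=1.0, n_modes=200, n_real=20000, seed=0):
    """Monte Carlo check of Var[P_hat] = 2 P^2 / N_k for one Gaussian shell.

    n_modes counts real and imaginary parts separately, following the
    mode-counting convention stated in the text, so the shell contains
    n_modes // 2 independent complex amplitudes.
    """
    rng = np.random.default_rng(seed)
    m = n_modes // 2
    # Real and imaginary parts are independent Gaussians with variance P/2.
    re = rng.normal(0.0, np.sqrt(P / 2.0), size=(n_real, m))
    im = rng.normal(0.0, np.sqrt(P / 2.0), size=(n_real, m))
    # Shell-averaged power estimate for each realization.
    p_hat = np.mean(re**2 + im**2, axis=1)
    expected_var = 2.0 * P**2 / n_modes
    return p_hat.mean(), p_hat.var(), expected_var

if __name__ == "__main__":
    mean, var, expected = shell_power_variance()
    print(mean, var, expected)
```

The measured variance should approach the expected value as the number of realizations grows; the agreement holds only for Gaussian modes, which is exactly the assumption that breaks down at non-linear scales.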

2.3 Information in the power spectrum 

Here, as in Paper I, we measure the Fisher information $I$ in a single
parameter: the log of the amplitude $A$ of the initial
(post-recombination) matter power spectrum, that is,

\[ I \equiv -\left\langle \frac{\partial^2 \ln\mathcal{L}}{\partial (\ln A)^2} \right\rangle . \tag{6} \]



For Gaussian density fluctuations, the power spectrum completely
specifies the statistical properties of the density field, so that the
only explicit dependence of the likelihood $\mathcal{L}$ is on the power
spectrum. For non-Gaussian fluctuations, the likelihood function may
also depend explicitly on other parameters. However, as in Paper I, we
consider only the information contained in the power spectrum $P(k)$, in
which case the information $I$ defined by equation (6) can be expanded
as

\[ I = -\sum_{k,k'} \left\langle \frac{\partial \ln P(k)}{\partial \ln A} \, \frac{\partial^2 \ln\mathcal{L}}{\partial \ln P(k) \, \partial \ln P(k')} \, \frac{\partial \ln P(k')}{\partial \ln A} \right\rangle . \tag{7} \]

The middle term on the right-hand side of equation (7) is the Hessian of
the vector $\ln P(k)$ of log non-linear powers, the expectation value of
which is the Fisher matrix. The power spectrum $P(k)$ is averaged over
spherical shells in $k$-space; a typical shell in the non-linear regime
contains several thousand to hundreds of thousands of distinct Fourier
modes, so it is reasonable to invoke the central limit theorem to assert
that estimates of power will be Gaussianly distributed about their
expectation values. This assertion holds even if the density field is
itself non-Gaussian. We test this explicitly in Section 4.2.3. In the
Gaussian approximation, the Hessian in equation (7) can be approximated
by the inverse of the covariance matrix of estimates of log-power.

The remaining terms in equation (7) are two partial derivatives which
describe the sensitivity of the non-linear power to changes in the
amplitude $A$ of the initial linear power. In the linear regime these
derivatives are identically unity, since $P_L(k) \propto A$; at
non-linear scales they are equal to the growth rate of the non-linear
power spectrum relative to the linear, which can be conveniently
measured from simulations.

The information $I$ has a particularly simple interpretation for
Gaussian fluctuations. Following equation (5), it is equal to half the
total number $N$ of Gaussian modes:

\[ I = N/2 . \tag{8} \]
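
The step from equation (5) to equation (8) can be written out as a short worked check. This is a sketch under the assumptions already stated in the text (Gaussian fluctuations at linear scales, so that the derivatives in equation (7) are unity) plus the small-fluctuation approximation $\Delta\ln\hat{P} \approx \Delta\hat{P}/P$; the Kronecker symbol $\delta_{kk'}$ and the identification $N \equiv \sum_k N_k$ are notation introduced here for illustration.

```latex
% Covariance of log-power for Gaussian modes, from equation (5):
\langle \Delta\ln\hat{P}(k)\,\Delta\ln\hat{P}(k') \rangle
  \approx \frac{\langle \Delta\hat{P}(k)^2 \rangle}{P(k)^2}\,\delta_{kk'}
  = \frac{2}{N_k}\,\delta_{kk'} .

% The Hessian in equation (7) is approximated by the inverse covariance,
% and the two derivatives are unity in the linear regime, so
I \approx \sum_{k} 1 \cdot \frac{N_k}{2} \cdot 1
  = \frac{1}{2}\sum_{k} N_k
  = \frac{N}{2} .
```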



As was found in Paper I and is confirmed in Section 5 of
the present paper, the information in the non-linear power 
spectrum P(k) is significantly less than the information in 
the linear power spectrum at the same wavenumber. This 
decrease in information could result from a transfer of in- 
formation from larger to smaller scales, a diversion of infor- 
mation into other quantities (such as the bispectrum) , or an 
irreversible loss of information. In Paper I we argued that 
complete loss of information during translinear evolution is 
inconsistent with our finding that the total amount of in- 
formation on non-linear scales is increasing with time. The 
remaining two scenarios could, in principle, be distinguished 
by measuring the information in the bispectrum and higher order
statistics, but this becomes progressively more difficult
as the order increases. 



2.4 Decorrelated band powers 

The quantity $I$ defined by equation (6) is the total amount of
information contained in the non-linear power spectrum about the
parameter $\ln A$. Some of this information is degenerate between
measurements of power at different wavenumbers as a result of the broad
correlations introduced by non-linear evolution.

Hamilton & Tegmark (2000) showed how to decorrelate a power spectrum by
defining a set of windowed band-power estimates. Decorrelation is the
process of assigning shared information uniquely to a given wavenumber.
Here, we extend their method to the case where we want to decorrelate,
not the power spectrum itself, but estimates of some parameter - in this
case $\ln A$ - made from the power spectrum.

Following Hamilton & Tegmark (2000), we define our windowed band-power
estimates $\hat{B}_k$ by:



\[ \frac{\hat{B}_k}{P_k} \equiv \sum_{k'} W_{kk'} \frac{\hat{P}_{k'}}{P_{k'}} , \tag{9} \]



where we use the index notation $P_k \equiv P(k)$ to emphasize the fact
that the shell-averaged power spectrum $P(k)$ can be viewed as a
discrete vector in Fourier space. The band-power windows $W_{kk'}$ in
equation (9) are elements of a decorrelation matrix, each column of the
matrix being a discrete window for one band-power $B_k$. There are many
(actually an infinite number of) schemes for decorrelating the power
spectrum, corresponding to different ways of sharing out degenerate
information between wavebands. The reader is directed to Hamilton &
Tegmark (2000) for a discussion of the relative merits of selected
decorrelation schemes.

Both sides of equation (9) are scaled by $P_k$, which is a fiducial
power spectrum. This scaling ensures that a given band power $B_k$ is
not dominated by leakage from wavenumbers $k'$ where the window is small
but $P_{k'}$ is large. The choice $P_k = \langle \hat{P}_k \rangle$
guarantees that the expectation value of the windowed band power
estimates at each wavenumber is equal to the original power spectrum,
provided that the windows are suitably normalized:



\[ \sum_{k'} W_{kk'} = 1 . \tag{10} \]



In order that the final estimates of $\ln A$ are uncorrelated, the
band-power windows must satisfy

\[ W^{\top} \Lambda W = D F D , \tag{11} \]

where $F$ is the Fisher matrix of the scaled power spectrum and $D$ is a
diagonal matrix with diagonal elements

\[ D_{kk} = \frac{\partial \ln P_k}{\partial \ln A} . \tag{12} \]

The matrix $DFD$ can be interpreted as the Fisher matrix of estimates of
$\ln A$ from the power in different wavebands and it is these estimates
(rather than the estimates of power themselves) that we want to
decorrelate. We experimented with various decorrelation matrices,
eventually opting for the upper triangular matrix $U$ obtained from a
generalized form of the Cholesky decomposition:

\[ U^{\top} \Lambda U = D F D , \tag{13} \]

where $\Lambda$ is a diagonal matrix - the Fisher matrix of the
decorrelated estimates of $\ln A$. Note that this is not the same as
decorrelating the power spectrum using the Cholesky decomposition of $F$
and then writing the Fisher matrix of the decorrelated estimates as
$D \Lambda D$ - as we did in Paper I - because $D$ does not commute with
$U$. The two approaches are only approximately equivalent in the case
where the band-power windows are narrow or the elements of $D$, given by
equation (12), are a slowly varying function of $k$. The former is a
particularly poor approximation, because of the broad correlations
present in the non-linear power spectrum. In Section 5 we present a
corrected version of the relevant figure (fig. 3) from Paper I. Our
conclusions are not altered significantly.

In the central-limit-theorem approximation that estimates of power are
Gaussianly distributed, the Fisher matrix $F$ is approximately equal to
the scaled inverse covariance of power:

\[ F \approx P \, \langle \Delta\hat{P} \, \Delta\hat{P}^{\top} \rangle^{-1} P . \tag{14} \]

Here, $P$ is a diagonal matrix whose non-zero elements are equal to the
fiducial power spectrum $P_k$. Mathematically, upper Cholesky
decorrelation is equivalent to taking a matrix composed of all the
elements of the covariance matrix up to some wavenumber $k_{\max}$,
inverting this matrix, and summing all the elements of the resulting
Fisher matrix to arrive at a measure of the accumulated information
$I(\le k_{\max})$ up to that wavenumber.
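
The sub-matrix recipe just described can be sketched in a few lines. This is an illustrative sketch rather than the code used for the results in this paper. It assumes the input is the covariance matrix of estimates of log-power, ordered from large to small scales, and that each band's estimate of $\ln A$ is obtained by dividing by the derivative in equation (12), so that the covariance of the $\ln A$ estimates is $D^{-1} C D^{-1}$; the function and argument names are hypothetical.

```python
import numpy as np

def cumulative_information(cov_lnP, dlnP_dlnA):
    """Cumulative information in ln A up to each wavenumber bin.

    cov_lnP   : (n, n) covariance of estimates of ln P, bins ordered
                from the largest scale (smallest k) to the smallest scale.
    dlnP_dlnA : (n,) derivatives d ln P_k / d ln A (unity at linear scales).
    """
    cov = np.asarray(cov_lnP, dtype=float)
    d = np.asarray(dlnP_dlnA, dtype=float)
    # Covariance of the per-band estimates of ln A (assumed scaling).
    cov_lnA = cov / np.outer(d, d)
    info = np.empty(len(d))
    for n in range(1, len(d) + 1):
        # Keep only bins up to k_max, invert, and sum every element
        # of the resulting Fisher matrix.
        fisher = np.linalg.inv(cov_lnA[:n, :n])
        info[n - 1] = fisher.sum()
    return info

if __name__ == "__main__":
    n_k = np.array([10.0, 40.0, 160.0])
    # Gaussian case: diagonal covariance 2/N_k, unit derivatives.
    print(cumulative_information(np.diag(2.0 / n_k), np.ones(3)))
```

For a diagonal Gaussian covariance the result reduces to the running sum of $N_k/2$, in line with equation (8); with correlated bins the cumulative information grows more slowly.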

Upper Cholesky decorrelation yields band-power windows that are highly
asymmetric, with the band power at each wavenumber $k$ containing
contributions only from power on larger scales. A problem with the other
schemes that we tried (including the square root of the scaled Fisher
matrix, recommended by Hamilton & Tegmark 2000), is that there is an
appreciable covariance between large, linear scales and small,
non-linear scales. Applying anything other than upper Cholesky
decorrelation assigns some of this covariance to large scales, causing
the information at linear scales to depart from the expected Gaussian
information, equation (8).

While there is a certain arbitrariness about choosing upper Cholesky
decorrelation over other possibilities, the resulting cumulative
information $I(\le k_{\max})$ has the virtue of a simple interpretation:
it is the information in the power spectrum $P(k)$ at wavenumbers
$k \le k_{\max}$, uncontaminated by information in power at smaller
scales.

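Because the Fisher matrix of the uncorrelated band powers
$B(k) = B_k$ is by definition diagonal, equation (7) reduces to a sum
over a single wavenumber and the cumulative information is:

\[ I(\le k) = -\sum_{k' \le k} \left\langle \frac{\partial \ln B(k')}{\partial \ln A} \, \frac{\partial^2 \ln\mathcal{L}}{\partial \ln B(k')^2} \, \frac{\partial \ln B(k')}{\partial \ln A} \right\rangle . \tag{15} \]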



3 SIMULATIONS 

The simulations used in this paper are the same as those used in
Paper I. The main ensemble comprises 600 gravitational N-body
simulations of the concordance cosmological model: 400 simulations with
a box size of $256\,h^{-1}\,\mathrm{Mpc}$ and a further 200 with a box
size of $128\,h^{-1}\,\mathrm{Mpc}$. These simulations were evolved
using a particle-mesh (PM) code with $128^3$ dark matter particles and a
$256^3$ force mesh.

An additional 25 simulations, also with a box size of
$128\,h^{-1}\,\mathrm{Mpc}$, were run using a parallel version of the
adaptive mesh refinement (AMR) code ART (Kravtsov, Klypin & Khokhlov
1997). Alone, this is an insufficient number to give precise statistics,
but together with the much larger ensemble of PM simulations, the ART
simulations serve as a useful check of our results on small scales,
where the AMR technique is more accurate. For the ART simulations, we
used $128^3$ particles and a $128^3$ root mesh with, at most, three
levels of refinement, giving a maximum spatial resolution equivalent to
that of a $1024^3$ mesh in dense regions. Gaussian initial conditions
for the ART simulations were set up using the publicly available GRAFIC
package.

Figure 1. Evolution of the non-linear power spectrum. Top panel: mean
power spectrum from the $256\,h^{-1}\,\mathrm{Mpc}$ PM simulations (open
points, with error bars derived from the scatter between individual
simulations); the $128\,h^{-1}\,\mathrm{Mpc}$ PM simulations (filled
points with error bars); and the $128\,h^{-1}\,\mathrm{Mpc}$ ART
simulations (stars). Power spectra are shown for three epochs (bottom to
top: a = 0.5, 0.67 and 1). The linear power spectrum is shown by the
dotted curves in each panel. The solid and dashed curves are,
respectively, from the fitting formulae of Smith et al. (2003) and
Peacock & Dodds (1996). The dot-dashed line marks the level of the shot
noise in the $256\,h^{-1}\,\mathrm{Mpc}$ simulations; the shot noise in
the $128\,h^{-1}\,\mathrm{Mpc}$ simulations is a factor of 8 lower.
Bottom panel: deviation between the mean power spectra of weighted
densities and the power spectrum of the unweighted density. The points
are medians from 100 of the PM simulations of each box size and the 25
ART simulations at the same 3 epochs. Error bars mark the upper and
lower quartiles of the distribution.

The cosmological parameters adopted are those of Tegmark et al. (2004,
second-last column of their table 4):

\[ (\Omega_{\rm M}, \Omega_\Lambda, f_{\rm b}, h, \sigma_8) = (0.29, 0.71, 0.16, 0.71, 0.97) . \tag{16} \]



This choice of parameters gives the best fit to the combination of the
power spectrum of fluctuations in the cosmic microwave background as
measured by the Wilkinson Microwave Anisotropy Probe (WMAP) and galaxy
clustering as measured by the Sloan Digital Sky Survey (SDSS), under the
important assumptions (among others) of a spatially flat universe
($\Omega_k = 0$) with a cosmological constant ($w = -1$) and a
scale-invariant primordial power spectrum ($n_s = 1$). Each simulation
was seeded with a different, randomly chosen realization of a Gaussian
random field with a power spectrum corresponding to the above
cosmological model. The matter transfer function was calculated from the
fitting formula of Eisenstein & Hu (1998) for universes with a
significant baryon content.

The non-linear power spectra of the above simulations are plotted in
Fig. 1 for three epochs: a = 0.5, 0.67 and 1.0, with error bars showing
the scatter between individual realizations. Power spectra were
calculated by Fourier transforming the weighted density field on a
single $256^3$ grid$^1$. The resulting power spectra were corrected for
smoothing by dividing by the square of the Fourier transform of the mass
assignment window, prior to subtracting the shot noise contribution
(Smith et al. 2003). As is to be expected, the power in the PM
simulations falls significantly below that measured in the higher
resolution ART simulations at small scales, as a result of mesh effects
in the PM simulations. The power spectrum from the ART simulations, on
the other hand, agrees well with the fits to previous N-body simulations
by Peacock & Dodds (1996) and Smith et al. (2003) at the scales of
interest. We restrict our analyses to scales for which corrections to
power from particle-cell assignment are small, and the shot noise
sub-dominant, so uncertainties in both of these corrections should not
affect our results.

In what follows, where the results from the $128\,h^{-1}\,\mathrm{Mpc}$
PM and ART simulations are consistent, we combine them into a single
data set for clarity.



4 CONSTRUCTING THE COVARIANCE 
MATRIX 

Estimating the amount of information in a set of measurements requires
knowledge of their covariance matrix. The most direct way to measure the
covariance of estimates of the power spectrum is to run an ensemble of
N-body simulations, each having a different, random realization of the
initial density field (e.g. Meiksin & White 1999; Scoccimarro et al.
1999; Paper I) - a computationally expensive endeavour because many
hundreds of realizations are required to yield an accurate estimate of
the covariance matrix (Meiksin & White 1999; Paper I).

Alternative approaches to measuring covariances include 'jackknife', in
which ensembles are formed by removing selected data from the original
sample, and 'bootstrap', in which ensembles are formed by resampling
with replacement (Künsch 1989). These methods work provided that the
data being sampled comprise independent random variables drawn from the
same distribution. For correlated data, Künsch (1989) suggested an
extension to the bootstrap approach, in which the data are first split
into blocks whose length is larger than the characteristic length of the
correlations, and these blocks are then re-sampled to generate the
bootstrap sample. In early experiments, we tried a form of 'block
bootstrap' in which we filled each octant of a simulation cube with a
block of data selected randomly from the cube. Unfortunately, the method
did not work well. The sharp edges of the octants introduced spurious
small-scale power, and the covariance of small-scale power differed
substantially from that measured by the ensemble method. The
mathematical relation between the covariance of power obtained by the
block bootstrap method and the true covariance of power is sufficiently
obscure that it was difficult to assess the possible causes of the
discrepancy.

$^1$ 'Chaining' (Jenkins et al. 1998), a clever way to extend
measurements of the power spectrum to smaller scales by superposing the
eight octants of a periodic cube on to a single octant, unfortunately
cannot be applied to the measurement of covariance, because it involves
a reduction in the number of Fourier modes. For Gaussian fluctuations,
each mode is independent, so each folding reduces the number of modes by
a constant factor 8. At non-linear scales, however, adjacent modes are
highly correlated, so subsampling them by a factor of 8 does not reduce
the effective number of modes correspondingly.

In HRS, we argue that all variations on jackknife and bootstrap are
really just different ways of re-weighting the data to yield a new
estimate of the quantity of interest, and that the best way to avoid
unpleasant side-effects on spatial data is to re-weight with a smooth
function of position. Further, by re-weighting the simulation with a
function that contains only large-scale Fourier modes, unpleasant
numerical artefacts in the power spectrum are confined to the largest
scales, making them much easier to deal with.



4.1 Covariance of power of weighted density 

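Following HRS, we define the i'th weighted density by

\[ \rho_i(\mathbf{r}) = w_i(\mathbf{r}) \rho(\mathbf{r}) . \tag{17} \]

We use the minimum variance set of weightings recommended in section 3
of HRS. In real space the i'th weighting has the form

\[ w_i(\mathbf{r}) = \sqrt{2} \cos\left[ 2\pi \left( \mathbf{k}_i \cdot \mathbf{r} + \frac{1}{16} \right) \right] , \tag{18} \]

where

\[ \mathbf{k}_i = \begin{cases} \{1,0,0\} & \text{12 weightings} \\ \{1,1,0\} & \text{24 weightings} \\ \{1,1,1\} & \text{16 weightings.} \end{cases} \tag{19} \]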



The different weightings are obtained by all possible reflections and
rotations of the components of $\mathbf{k}$, which yields 26 weightings
in total, allowing for all of the symmetries in equation (18). A further
26 weightings are obtained by adding a phase shift of $\pi/2$, i.e.
$1/16 \to 5/16$ in equation (18), equivalent to translating one of the
coordinates by a quarter box (because of the symmetry of the weighting
functions, only one such translation yields a distinct weighting). The
estimate of power from this second set of weightings is predicted by HRS
to be highly anti-correlated with that obtained from the first set of 26
weightings, but they are sufficiently uncorrelated that including all 52
weightings does yield a better measurement of the covariance matrix than
is obtained from only 26 weightings. It is possible to apply more
weightings constructed from higher order modes than those in equation
(19) but, as argued in section 3.5 of HRS, these contain progressively
less independent information, and yield progressively less accurate
estimates of covariance.
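
The counting of weightings described above can be reproduced with a short sketch. This is an illustration only, not the code used to produce the results in this paper. It assumes positions are expressed in units of the box size, reads "reflections and rotations" as all sign flips and permutations of the components of the wavevector, and uses hypothetical function names.

```python
import itertools
import numpy as np

def weighting_wavevectors():
    """All distinct sign flips and permutations of {1,0,0}, {1,1,0}, {1,1,1}."""
    vecs = set()
    for base in [(1, 0, 0), (1, 1, 0), (1, 1, 1)]:
        for perm in itertools.permutations(base):
            for signs in itertools.product((1, -1), repeat=3):
                vecs.add(tuple(s * c for s, c in zip(signs, perm)))
    return sorted(vecs)

def all_weightings():
    """(wavevector, phase) pairs: 26 wavevectors times the two phases."""
    return [(k, phase) for phase in (1 / 16, 5 / 16)
            for k in weighting_wavevectors()]

def weighting(k, phase, r):
    """Evaluate sqrt(2) cos[2 pi (k . r + phase)], as in equation (18).

    r has shape (..., 3), with positions in units of the box size.
    """
    return np.sqrt(2.0) * np.cos(2.0 * np.pi * (r @ np.asarray(k) + phase))

if __name__ == "__main__":
    print(len(weighting_wavevectors()), len(all_weightings()))
```

Each weighting has zero mean and unit mean square over the periodic box, so re-weighting leaves the overall normalization of the power unchanged on average.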

A covariance matrix estimated as the average of n distinct estimates can
have a rank no greater than n, a fact also pointed out recently by Pan &
Szapudi (2005). The 52 weightings given by equation (19) prove
sufficient to yield, for each simulation, a non-singular (no zero
eigenvalues), and indeed positive definite (no negative eigenvalues)
estimate of the covariance matrix for the 20 bins of wavenumber used
here.

Figure 2. Correlation matrices of estimates of non-linear power using
the re-weighting scheme of HRS at 3 epochs (left to right a = 0.5, 0.67
and 1). Correlation matrices were estimated separately for each
simulation and then averaged over simulations. Results shown are
averages over 100 PM simulations of each box size and additionally, in
the case of the $128\,h^{-1}\,\mathrm{Mpc}$ boxes, the 25 ART
simulations. Greyscale is used to indicate the magnitude of the
correlations, ranging from 0 (no correlation) to 1 (perfect
correlation). A heavy, black border outlines the region of each
covariance matrix that is unaffected by numerical artefacts from the
re-weighting scheme. Bins outside of the bordered area are excluded from
further analyses.

As a practical matter, we implement the re-weighting 
scheme by weighting individual particles, before assigning 
the density to the Fourier mesh. The weighted overdensity 
at point j on the mesh is defined to be 



\[ \delta_i(\mathbf{r}_j) \equiv \frac{\rho_i(\mathbf{r}_j)}{\bar\rho} , \tag{20} \]



where $\bar\rho$ is the mean of the unweighted density field. Note that
this is a different convention to that used in HRS, in which the
quantity being transformed is

\[ \Delta\rho_i(\mathbf{r}) \equiv \rho_i(\mathbf{r}) - \bar\rho_i(\mathbf{r}) , \tag{21} \]

where $\bar\rho_i(\mathbf{r}) = w_i(\mathbf{r})$ and $\bar\rho = 1$. The
difference between these two conventions only affects the power on the
largest scales (those wavebands containing modes appearing in equation
(19)) and can most easily (and accurately) be corrected for in Fourier
space. However, since the covariance on these scales is not correctly
reproduced by the re-weighting method, we simply exclude them from our
analyses.

The covariance of power over the ensemble of weighted powers is related
to the true covariance of power by (HRS)

\[ \langle \Delta\hat{P}(k) \, \Delta\hat{P}(k') \rangle = 2 \langle \Delta\hat{P}_i(k) \, \Delta\hat{P}_i(k') \rangle_i , \tag{22} \]

where $\langle\,\rangle_i$ denotes an average over different weightings
and the factor 2 corrects the covariance of the ensemble to the true
covariance of estimates of power. The deviation $\Delta\hat{P}_i(k)$ in
the power spectrum of the i'th weighted density must be measured
relative to some expected or mean value, and HRS discussed two
possibilities. The first is to measure the deviation relative to the
power spectrum of the unweighted density of the simulation:

\[ \Delta\hat{P}_i(k) \equiv \hat{P}_i(k) - \hat{P}(k) . \tag{23} \]

The second is to measure the deviation relative to the mean of the power
spectra of the weighted densities:

\[ \Delta\hat{P}_i(k) \equiv \hat{P}_i(k) - \frac{1}{N} \sum_{j} \hat{P}_j(k) , \tag{24} \]

with the sum taken over all $N$ weightings.

The advantage of the first strategy, equation 12311 . is that the 
power spectrum of the unweighted density is, by symmetry, 
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a=0.5 a=0.67 a=l 




0.1 1.0 0.1 1.0 0.1 1.0 

comoving wavenumber k I h Mpc~' 



Figure 3. Correlations between estimates of the non-linear power spectrum using the re-weighting method, with A Pi defined by 
equation 1241 . The three columns show cross-sections through the correlation matrices at 3 epochs (left to right a = 0.5, 0.67 and 1), 
while each row shows a cross-section at a different wavenumber (k' = 0.057, 0.242 and 1.036 /iMpc" 1 ), marked by the vertical arrow in 
each panel. Symbols with error bars mark the median and quartiles of the distribution of correlation co-efficients measured from each 
individual simulation for the 256 h _1 Mpc PM (open symbols) and 128 h^ 1 Mpc PM+ART (filled symbols) simulations. Dotted error 
bars are used for points that are outside of the range of wavenumbers for which the covariance is expected to be reliably estimated by 
the re-weighting method. Solid and dashed lines show the correlations between (unweighted) estimates of power using the full ensemble 
of 256 h~ 1 Mpc PM and 128 h~ 1 Mpc PM+ART simulations, respectively. Data for the 128 h~ 1 Mpc boxes are missing in the top row 
because this entire cross-section lies outside the reliable region for this box size. 



a=0.5 a=0.67 a=l 




0.1 1.0 0.1 1.0 0.1 1.0 

comoving wavenumber k I h Mpc -1 



Figure 4. As Fig. [3] but with AP^ defined by equation 1231 . The symbols and lines have the same meanings as in Fig. [3] Together, the 
two figures demonstrate that the different methods give broadly consistent results. 




Figure 5. Variance of estimates of power using the re-weighting method with $\Delta P_i$ defined by (a) equation (24) and (b) equation (23).
Open symbols show the variance, normalized to the square of the power, at a = 1 in the 256 $h^{-1}$ Mpc simulations (squares) and the
128 $h^{-1}$ Mpc simulations (diamonds). The results for the 128 $h^{-1}$ Mpc box size have been shifted vertically by a factor 8 to compensate
for the reduced density of modes and enable direct comparison between the two box sizes. Error bars mark the upper and lower quartiles
of the distribution of estimates of variance over individual simulations. Solid symbols give the corresponding results measured using the
ensemble method (these points are the same in both panels). The solid and dashed curves are predictions from perturbation theory,
with (solid) and without (dashed) the beat-coupling terms (see HRS for details of the calculation). The heavier lines are the results for
the 256 $h^{-1}$ Mpc box size. The dotted line is the expected variance for Gaussian fluctuations.



the most accurate estimate of power in a simulation, so the 
statistical uncertainty is potentially least in this case. How- 
ever, the power spectra of weighted densities are likely to 
be slightly biased relative to the power spectrum of the un- 
weighted density, because weighting the density effectively 
smooths the power, which biases it if power is other than a 
linear function of wavenumber. The advantage of the second
strategy, equation (24), is that it removes this slight bias, so
the systematic uncertainty is potentially smaller in this case.
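
As an illustration only, the two strategies can be sketched in Python on made-up numbers. The function names and the toy array below are assumptions introduced for this sketch, not part of HRS or of the simulations analysed here; rows stand for weightings and columns for wavenumber bins.

```python
from statistics import fmean

def deviations_from_unweighted(weighted_power, unweighted_power):
    """Equation (23): deviation of each weighted estimate from the unweighted power."""
    return [[p - p0 for p, p0 in zip(row, unweighted_power)]
            for row in weighted_power]

def deviations_from_mean(weighted_power):
    """Equation (24): deviation of each weighted estimate from the mean over weightings."""
    mean_power = [fmean(list(column)) for column in zip(*weighted_power)]
    return [[p - m for p, m in zip(row, mean_power)] for row in weighted_power]

def mean_square(deviations, bin_index):
    """Mean squared deviation in one wavenumber bin."""
    return fmean([row[bin_index] ** 2 for row in deviations])

# Hypothetical toy data: 3 weightings (rows) by 2 wavenumber bins (columns).
weighted = [[10.0, 4.0], [12.0, 5.0], [11.0, 6.0]]
unweighted = [11.5, 5.0]

dev23 = deviations_from_unweighted(weighted, unweighted)
dev24 = deviations_from_mean(weighted)
```

Within a bin, the mean-square deviation about any fixed reference exceeds that about the mean by the square of the offset between the two, so in this toy setting equation (23) can never give a smaller value than equation (24). This is only a schematic analogue of the systematic difference discussed in the text.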

The lower panel of Fig. shows the median and quar- 
tiles of the distribution of the deviations between the av- 
eraged power spectra of weighted densities and the power 
spectrum of the unweighted density of each simulation (c.f. 
equation 20 of HRS). The two agree well on small scales, 
but on larger scales they can differ by up to 20 percent in 
extreme cases. 

In the following sections, we show results for both
strategies, equations (23) and (24), and find that the two
strategies give consistent results. However, we do find that
equation (23) gives variances that are slightly but
systematically higher than those from equation (24), which we
attribute to the systematic bias between the power spectra
of weighted densities and the power spectrum of the
unweighted density. For the purposes of computing
information in Section 5, we therefore choose the second
strategy, equation (24).

Each set of weightings in equation (19) is generated
from a different wavenumber $k_i = |\mathbf{k}_i|$, so we expect
estimates of power from each set to be biased in a slightly
different way. This bias is small; nevertheless it is preferable
to estimate the covariance matrix separately from each set
of weightings and then combine them to obtain a single
estimate of the covariance matrix, weighting by the number of
weightings in each set. This is the procedure that we adopt
in Section 5.
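
The combination step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `combine_covariances`, the example matrices and the counts of weightings per set are all hypothetical, and the covariance matrices are plain nested lists.

```python
def combine_covariances(covariances, n_weightings):
    """Average covariance matrices, weighting each by its number of weightings."""
    total = sum(n_weightings)
    size = len(covariances[0])
    combined = [[0.0] * size for _ in range(size)]
    for cov, n in zip(covariances, n_weightings):
        for i in range(size):
            for j in range(size):
                combined[i][j] += n * cov[i][j] / total
    return combined

# Hypothetical example: two sets of weightings, containing 1 and 3 weightings.
cov_set_a = [[1.0, 0.0], [0.0, 1.0]]
cov_set_b = [[3.0, 2.0], [2.0, 3.0]]
combined = combine_covariances([cov_set_a, cov_set_b], [1, 3])
```

Because the weights are the numbers of weightings, a set that contributes more estimates of power pulls the combined matrix further towards its own covariance.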

4.2 Tests 

In this section we describe several tests of the measurement
of covariance of power. In Section 4.2.1 we show that the
re-weighting and ensemble methods give consistent results for
the correlation coefficients of band-powers. By contrast, in
Section 4.2.2 we show that the two methods give substantially
different results for the variance of power at non-linear
scales. In Section 4.2.3 we check the assumption that the
distribution of estimates of power is (thanks to the central limit
theorem) adequately Gaussian.

4.2.1 Correlations in the power spectrum

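Fig. 2 shows the matrix of correlation coefficients,

$\rho_{kk'} \equiv \frac{\langle \Delta P_k \, \Delta P_{k'} \rangle}{\sqrt{\langle \Delta P_k^2 \rangle \langle \Delta P_{k'}^2 \rangle}}$, (25)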



of estimates of the non-linear power spectrum at three
epochs, measured using the re-weighting scheme outlined in
Section 4.1. Each matrix is the average result from 100
individual PM simulations and, in the case of the 128 $h^{-1}$ Mpc
boxes, 25 ART simulations. The final epoch (a = 1) can be
directly compared to fig. 2 of Paper I, in which we show
the results from the ensemble method. We expect numerical
artefacts from the re-weighting to be restricted to
wavenumbers $k \lesssim \sqrt{3}\,k_{\rm b}$, the highest wavenumber used in the
weighting functions. Fig. 2 shows that this is indeed the case: the
degree of correlation changes abruptly between bins containing
wavenumbers inside and outside this limit. In the following
analysis, we use only bins with wavenumbers at least a factor of
2 away from the fundamental mode of the box, $k \geq 2k_{\rm b}$.
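
A matrix of correlation coefficients of this kind can be computed from a set of deviations as in the sketch below. It assumes the standard normalization by the square root of the product of the two variances; the function name and the toy deviations are hypothetical and are not taken from the simulations.

```python
from math import sqrt

def correlation_matrix(deviations):
    """Correlation coefficients between wavenumber bins.

    deviations[i][k] is the deviation of estimate i in bin k.
    """
    n_est = len(deviations)
    n_bins = len(deviations[0])
    # Covariance of the deviations, averaged over estimates.
    cov = [[sum(row[a] * row[b] for row in deviations) / n_est
            for b in range(n_bins)] for a in range(n_bins)]
    # Normalize each element by the geometric mean of the two variances.
    return [[cov[a][b] / sqrt(cov[a][a] * cov[b][b]) for b in range(n_bins)]
            for a in range(n_bins)]

# Hypothetical deviations: 4 estimates in 2 wavenumber bins.
toy_deviations = [[1.0, 2.0], [-1.0, -2.0], [2.0, 1.0], [-2.0, -1.0]]
rho = correlation_matrix(toy_deviations)
```

By construction the diagonal elements equal 1, so the matrix isolates the degree of correlation between bins from the overall size of the variance, which is examined separately in Section 4.2.2.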

Fig. 3 shows three cross-sections through each of the
correlation matrices in Fig. 2. The data plotted are the
medians and quartiles of the distribution over the independent
realizations. The measurements from the re-weighting and
ensemble methods are generally consistent, although the
re-weighting method tends to yield somewhat higher
correlations at smaller scales, and higher for the 256 $h^{-1}$ Mpc boxes
than the 128 $h^{-1}$ Mpc boxes. The scatter between individual
simulations is considerable, especially where the correlation
coefficient is small.

Fig. 4 shows the same results, but with the deviations
$\Delta P_i(k)$ in the power spectra of weighted densities being
measured relative to the power spectrum of the unweighted
density, equation (23), as opposed to the mean of the power
spectra of weighted densities, equation (24). The correlation
coefficients are similar to those shown in Fig. 3, as they
should be.



4.2.2 Variance of the power spectrum

While the re-weighting and ensemble methods yield consis- 
tent results for the correlation matrix of non-linear power, 
the variance of (and hence the information contained in) the 
non-linear power spectrum is an altogether different matter. 

Fig. 5 shows the variance, normalized to the square
of the unweighted power spectrum, estimated using the
re-weighting method. We show results for both equation (23)
and equation (24). Notice that, as expected, equation (23)
overestimates the variance of power on large scales. On small
scales, however, the results are entirely consistent.

For Gaussian fluctuations, the normalized variance
equals $2/N_k$ (equation 5), where $N_k$ is the number of modes
in a given wavenumber bin. For a periodic box, the number
of modes in a set of logarithmically-spaced bins increases
with central wavenumber as $N_k \propto k^3$. The 128 $h^{-1}$ Mpc
simulations have fewer modes (by a factor of 8) at a given
wavenumber, so the results for this box size have been shifted
vertically down by this factor to allow a direct comparison
between the results for the two box sizes. As in Fig. 3, the
data shown in Fig. 5 are medians over many individual
simulations, with error bars marking the quartiles of the
distribution. The results for the 128 $h^{-1}$ Mpc PM simulations
and the 128 $h^{-1}$ Mpc ART simulations are consistent, so we
combine them into a single set of points for clarity (a figure
showing the ART results alone appears as fig. 2 of HRS).
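
The factor of 8 can be checked by counting modes directly. The sketch below is illustrative only: it enumerates the discrete wavevectors of a periodic box that fall in a single bin, with a hypothetical bin of 0.4 to 0.8 $h\,\mathrm{Mpc}^{-1}$ and function names chosen for this example. It counts both $\mathbf{k}$ and $-\mathbf{k}$; counting only independent modes of a real field would halve both numbers and leave the ratio unchanged.

```python
from math import pi

def count_modes(box_size, k_lo, k_hi):
    """Count wavevectors k = (2*pi/box_size) * n with k_lo <= |k| < k_hi."""
    k_fund = 2.0 * pi / box_size          # fundamental wavenumber of the box
    n_max = int(k_hi / k_fund) + 1        # largest integer component needed
    lo2 = (k_lo / k_fund) ** 2
    hi2 = (k_hi / k_fund) ** 2
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            for nz in range(-n_max, n_max + 1):
                if lo2 <= nx * nx + ny * ny + nz * nz < hi2:
                    count += 1
    return count

def gaussian_normalized_variance(n_modes):
    """Normalized variance of power for Gaussian fluctuations, 2 / N_k."""
    return 2.0 / n_modes

# Box sizes in h^-1 Mpc, bin edges in h Mpc^-1 (illustrative choices).
n_large = count_modes(256.0, 0.4, 0.8)
n_small = count_modes(128.0, 0.4, 0.8)
ratio = n_large / n_small
```

The ratio is close to, but not exactly, 8 because the modes lie on a discrete lattice; the Gaussian variance in the smaller box is larger by the same factor.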

The two different box sizes yield consistent results when 
the ensemble method is used. For the re-weighting method, 
on the other hand, there are clear discrepancies, both be- 
tween the re-weighting method and the ensemble method 
and between the two box sizes. At translinear and non-linear 
scales, the variance measured by re-weighting is significantly
higher than that measured for the ensemble, particularly for 
the larger box size, and the discrepancy grows ever larger at 
smaller scales. 

Fig. 5 also shows the predictions of perturbation theory.
The calculation of these curves is described in section 4.3 of
HRS. Perturbation theory helps to explain the discrepancies
both between the results of the ensemble and re-weighting

Figure 6. Distributions of estimates of non-linear power, equation (24),
using the re-weighting method (top panel) and the
ensemble method (bottom panel). Individual histograms for all
bands with wavenumbers $k > 0.2\,h\,\mathrm{Mpc}^{-1}$, containing at least
2000 Fourier modes, are scaled to unit variance and stacked. The
smooth curves are Gaussian fits to the data, assuming Poisson
weighting of the counts in each bin.



methods, and between the two different box sizes. The vari- 
ance measured by the re-weighting technique departs from 
the ensemble result where the (constant) beat-coupling term 
(equation 94 of HRS) becomes the dominant source of vari- 
ance. The source of this term - coupling of closely-spaced 
Fourier modes to the large-scale beat mode between them - 
is discussed in section 4 of HRS. The magnitude of the beat- 
coupling term is proportional to the amplitude of the power 
spectrum on roughly the size of the box, which explains 
why different sizes of simulation box yield systematically 
different estimates of the small-scale variance. Perturbation 
theory correctly predicts the magnitude of the discrepancy 
between the small-scale variance estimated from the two dif- 
ferent sizes of simulation. 

Note that perturbation theory fails to reproduce the 
correct non-linear variance for the ensemble method (dashed 
lines), presumably because perturbation theory breaks down 
in the highly non-linear regime. This may be responsible 
for the small discrepancies between the results of the
re-weighting method and the theoretical curves in Fig. 5.

4.2.3 Distribution of estimates of power

In Section|5|we asserted that, thanks to the central limit the- 
orem, estimates of power should be Gaussianly distributed 
about their expectation value, even in the highly non-linear 
regime. Since our analysis relies on the validity of this as- 
sumption it is worth putting to the test. 

Figure 7. Cumulative information in the non-linear power spectrum at 3 epochs (top to bottom: a = 0.5, 0.67 and 1), as a function of (a)
comoving and (b) physical wavenumber. Large symbols are points derived from the 256 $h^{-1}$ Mpc PM simulations and small symbols
are for the 128 $h^{-1}$ Mpc PM+ART boxes. The results for the 128 $h^{-1}$ Mpc boxes have been shifted vertically by a factor 8 to account
for the lower density of modes in the smaller box at a given comoving $k$ and allow direct comparison with the larger box size. The dotted line marks the
expected amount of information for Gaussian fluctuations and the solid curves are the result of applying the PD96 scaling to the dotted
line at each epoch. This is a revised version of fig. 3 in Paper I (see text for details).

Fig. 6 shows the distribution of deviations of estimates
of power using both the re-weighting method and the
ensemble method for wavenumbers in the non-linear regime
(which we take to be $k > 0.2\,h\,\mathrm{Mpc}^{-1}$). Individual
histograms for each waveband have been scaled to unit
variance and stacked to produce a single distribution for each
method. The estimates of power from the re-weighted
simulations are indeed distributed close to Gaussianly. Assuming
Poisson statistics, the value of $\chi^2$ for the fit is 5.84 per degree
of freedom, which seems reasonable, given the high degree of
correlation between estimates of non-linear power. For the
ensemble method, the distribution is also close to Gaussian,
although there are deviations from Gaussianity - in particular
the presence of a sharper than Gaussian peak and a tail
of large, positive deviations that cause the actual mean and
variance of power to be slightly larger than the fitted values.
The value of $\chi^2$ in this case is 3.86 per degree of freedom,
for Poisson statistics.
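
The stacking and $\chi^2$ test described above can be sketched on synthetic data. The following is a simplified illustration, not the procedure used for Fig. 6: it draws independent Gaussian values for hypothetical bands, compares the stacked histogram with a fixed unit Gaussian rather than a fitted one, and divides by the number of histogram bins. All names, sample sizes and bin edges are assumptions.

```python
import random
from math import erf, sqrt
from statistics import fmean, pstdev

def standardize(values):
    """Shift to zero mean and scale to unit variance."""
    mean, sigma = fmean(values), pstdev(values)
    return [(v - mean) / sigma for v in values]

def gaussian_cdf(x):
    """Cumulative distribution of a unit Gaussian."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def chi2_per_bin(stacked, edges):
    """Chi-squared per histogram bin against a unit Gaussian, Poisson weighting."""
    n_total = len(stacked)
    chi2 = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(1 for v in stacked if lo <= v < hi)
        expected = n_total * (gaussian_cdf(hi) - gaussian_cdf(lo))
        chi2 += (observed - expected) ** 2 / expected  # Poisson variance = expected
    return chi2 / (len(edges) - 1)

random.seed(1)
# Hypothetical data: 20 bands, 200 independent Gaussian estimates per band.
bands = [[random.gauss(100.0 * (b + 1), 5.0) for _ in range(200)]
         for b in range(20)]
stacked = [v for band in bands for v in standardize(band)]
edges = [-3.0 + 0.5 * i for i in range(13)]
reduced_chi2 = chi2_per_bin(stacked, edges)
```

For independent Gaussian draws the statistic is expected to be of order unity. Estimates of non-linear power are strongly correlated, which is the reason given in the text for the larger values found in the simulations.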



5 INFORMATION CONTENT OF THE NON-LINEAR POWER SPECTRUM

In the previous section, we showed that measuring the co- 
variance of non-linear power by re-weighting an individual 
simulation yields substantially different results at non-linear 
scales than is found from the scatter over an ensemble of sim- 
ulations. The discrepancy is consistent with the explanation 
proposed by HRS: beat-coupling between the covariance on 
non-linear scales and the power on large scales. The dif- 
ference in covariance between the two methods translates 
directly into a difference in the quantity of information in 
the non-linear power spectrum. 



5.1 Method 



We decorrelate estimates of the (log-) amplitude for each
simulation individually, using the covariance matrix
estimated using the re-weighting method. As our fiducial power
spectrum we use the true (unweighted)
power in each simulation. The re-weighting method restricts
the range of wavenumbers for which the covariance matrix -
and hence the Fisher information - can be reliably measured
to $k > \sqrt{3}\,k_{\rm b}$. Since our purpose is to measure the cumulative
information up to some wavenumber $k_{\max}$, the
contribution from wavenumbers smaller than this limit must
be taken into account. For the 256 $h^{-1}$ Mpc boxes we assume
that the excluded bins contain the expected amount of
information for Gaussian fluctuations ($N_k/2$, where $N_k$ is the
number of Fourier modes in those bins). That this is a
reasonable assumption is confirmed by the fact that the first few
bins for which we do have measurements of the variance
using the re-weighting method follow the Gaussian expectation
closely (see the top panel of Fig. 5). For the 128 $h^{-1}$ Mpc
boxes there is a further complication. The lower limit on
the wavenumbers accessible using the re-weighting method
brings us into the regime in which non-linear effects start
to become important. The most reasonable thing to do here
would seem to be to use the results from the 256 $h^{-1}$ Mpc
boxes to estimate the quantity of missing information.
Although there are clearly systematic differences between the
results of the re-weighting method for different box sizes,
these are small at the scales in question.

Figure 8. Cumulative information, as a function of (a) comoving and (b) physical wavenumber, for the same 3 epochs as Fig. 7. Points
with error bars are medians and quartiles of the results measured using the re-weighting method on 100 PM simulations with a box size
of 256 $h^{-1}$ Mpc. For clarity, the three sets of points have been artificially separated by a small factor in wavenumber, with the a = 0.67
points having the correct wavenumber. For comparison, the ensemble results from Fig. 7 are shown as light, dashed lines. The dotted
line marks the expected amount of information for Gaussian fluctuations and the solid curves are the result of applying the PD96 scaling
to the dotted line at each epoch.




[Figure 9 axes: (a) comoving wavenumber $k$ / $h\,\mathrm{Mpc}^{-1}$; (b) physical wavenumber $k/a$ / $h\,\mathrm{Mpc}^{-1}$]

Figure 9. Cumulative information, as Fig. 8, for 100 PM simulations and 25 ART simulations with a box size of 128 $h^{-1}$ Mpc. The
results have been shifted vertically by a factor of 8 to enable direct comparison with the larger box size. The symbols and lines have the
same meanings as in Fig. 8, but note the different axis ranges.
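
The way a covariance matrix translates into cumulative information in Section 5.1 can be illustrated with a deliberately simplified sketch. It is not the decorrelation formalism of Section 2.4: it assumes that the logarithmic derivative of the power with respect to the amplitude is 1 in every bin, takes the covariance of the logarithm of the power as input, and adds $N/2$ for excluded large-scale bins treated as Gaussian. The helper names and the toy matrices are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def cumulative_information(log_cov, n_bins_used, n_modes_excluded=0):
    """Information on a log-amplitude from the first n_bins_used bins.

    Assumes d ln P / d ln A = 1 in every bin (an illustrative simplification)
    and adds N/2 for excluded bins that are treated as Gaussian.
    """
    sub = [row[:n_bins_used] for row in log_cov[:n_bins_used]]
    ones = [1.0] * n_bins_used
    # Sum of all elements of the inverse of the sub-matrix.
    return sum(solve(sub, ones)) + n_modes_excluded / 2.0

# Gaussian case: diagonal covariance 2/N_k with hypothetical N_k = 100 and 800.
gaussian_cov = [[2.0 / 100, 0.0], [0.0, 2.0 / 800]]
info_gaussian = cumulative_information(gaussian_cov, 2, n_modes_excluded=20)

# Correlated toy case: unit variances with correlation coefficient 0.5.
correlated_cov = [[1.0, 0.5], [0.5, 1.0]]
info_correlated = cumulative_information(correlated_cov, 2)
```

In the Gaussian case the result reduces to half the total number of modes, while in the correlated toy case the information falls below the value of 2 that two uncorrelated unit-variance bins would give. This is the sense in which broad correlations in the power spectrum reduce the information.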



5.2 Results 

5.2.1 Ensemble method 

Fig. 7 shows the cumulative information as a function of
wavenumber at 3 epochs (a = 0.5, 0.67 and 1) for the
ensemble method of estimating the covariance of power.

The curves in Fig. 7 differ somewhat from those in fig. 3
of Paper I. In the previous paper, we (incorrectly)
decorrelated the power spectrum prior to multiplying by the
partial derivatives in equation (7). As discussed in Section 2.4,
this is only a good approximation to the exact result
provided that the band-power windows are narrow in $k$, which
is a poor assumption in our case because of the existence
of broad correlations in the power spectrum. The curves in
Fig. 7 (and Figs 8 and 9 later) were produced using the
exact formalism set out in Section 2.4 of the present paper.
We still find that the ensemble method yields very little
independent information in the translinear regime
($k \sim 0.2$-$0.8\,h\,\mathrm{Mpc}^{-1}$), over and above that in the linear regime,
although the flatness is not as pronounced as suggested by
our previous results. In fact, the cumulative information
increases by a factor of approximately 2 over the
aforementioned range of wavenumbers (at a = 1), still much less
than the factor of $k^3 \sim 64$ expected for linear fluctuations.
There remains a sudden upturn in the cumulative information
on small scales, implying that the power spectrum at
fully non-linear scales contains unique information about the
amplitude of the initial power spectrum that is not found
in the present-day linear power spectrum.
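
For reference, the factor of 64 follows from counting modes. Assuming the continuum limit for a box of volume $V$ (an approximation introduced here, not spelled out in the text), the number of modes below wavenumber $k$ and the corresponding Gaussian information of $N/2$ are:

```latex
N(<k) = \frac{V}{(2\pi)^3} \, \frac{4\pi k^3}{3} = \frac{V k^3}{6\pi^2},
\qquad
I_{\rm Gauss}(<k) = \frac{N(<k)}{2},
\qquad
\frac{I_{\rm Gauss}(<0.8)}{I_{\rm Gauss}(<0.2)} = \left(\frac{0.8}{0.2}\right)^3 = 64 .
```

The ratio does not depend on the box volume or on whether only the independent modes of a real field are counted, since both enter as overall factors.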

In Paper I, we interpreted the decrease in information 
in the translinear regime as the result of rapid transfer of 
information from larger to smaller scales. An alternative ex- 
planation, which was mentioned in Paper I but discarded 
as being contrived, is that information is temporarily di- 
verted into higher order statistics, such as the bispectrum, 
in the translinear regime. Since filamentary structures are 
more readily described by higher-order statistics than by 
the power spectrum alone, it is entirely plausible that, on 
translinear scales, the bispectrum does contain information 
not present in the power spectrum, about the initial con- 
ditions of structure formation, and that this information is 
somehow returned to the power spectrum on smaller scales. 

We argued in Paper I that, if information is conserved 
overall then, under the assumption of stable clustering, the 
amount of information up to a given physical (as opposed to 
comoving) wavenumber ought to be independent of time in 
the fully non-linear regime. The right panel of Fig. 7 shows
the cumulative information for the same three epochs, plot- 
ted as a function of physical wavenumber k/a. The results 
are consistent with the hypothesis that information is largely 
conserved by non-linear evolution. 

5.2.2 Re-weighting method 

In Figs 8 and 9 we compare the results from the ensemble
method with those from the re-weighting method. For the 
re-weighting method we show the median and quartiles of 
the distribution of results from the individual simulations. 
Qualitatively, the behaviour of the cumulative information 
- as a function of both wavenumber and cosmic epoch - for 
the re-weighting method is similar to that for the ensemble 
method. However, the flattening in the translinear regime is 
less pronounced than for the ensemble case and there is no 
clear evidence for an upturn on small scales, although the 
curves for the re-weighting method follow the average slope 
of those from the ensemble method. Overall, the information 
measured using the re-weighting method is considerably less
than when the ensemble method is used. Beat-coupling to 
large scales prevents much of the information that is, in prin- 
ciple, contained in the power spectrum from being extracted 
when the re-weighting method is used. It is also worth not- 
ing that the magnitude of the beat-coupling effect depends 
on the size of the simulation, so that the two different box 
sizes are no longer in complete agreement on small scales. 
The 128 $h^{-1}$ Mpc boxes (Fig. 9) are in closer agreement with
the ensemble method, as expected. 



As predicted by perturbation theory, the beat-coupling
contribution to the covariance, which is a factor
$\sim P(2k_{\rm b})/P(k)$ larger than the other terms at a given $k$,
becomes increasingly dominant at smaller scales. For the
power spectrum considered here, the contribution from beat- 
coupling also increases with cosmic epoch. We would there- 
fore not expect the cumulative information, measured using 
the re-weighting method, up to a given physical wavenum- 
ber to be necessarily constant over time, even in the stable 
clustering regime. 



6 SUMMARY 

This paper extends and expands on the results reported in 
Paper I concerning the Fisher information contained in the 
non-linear power spectrum about the amplitude of the initial 
(post-recombination) linear power. 

In Paper I, we measured the covariance of power from 
the scatter in power over a large ensemble of simulations. 
Here we reported measurements of covariance of power from 
both the ensemble method and a new method, described in a 
companion paper (HRS), in which smoothly varying
weighting functions are applied to each simulation to yield a
separate estimate of the covariance of power for each simulation.

We have shown that the two methods yield substantially 
different estimates for the covariance of power at non-linear 
scales. This does not mean, however, that one or other of 
the methods is incorrect. Rather, it turns out that measur- 
ing the covariance of power is a more subtle problem than 
we had previously suspected. Beat-coupling - the process
whereby products of Fourier modes separated
by a small wavevector couple by gravitational growth to the
large-scale beat mode between them - dominates the
covariance on non-linear scales. We compared our results to
a calculation using perturbation theory (HRS) and found
that the theory explains qualitatively the features of
beat-coupling seen in the simulations.

Beat-coupling contributions to the covariance of power 
occur whenever Fourier modes have a finite width, as op- 
posed to being delta-functions at discrete wavevectors. As 
argued by HRS, this means that beat-coupling is likely to 
be important in real galaxy surveys. The ensemble method, 
on the other hand, measures covariances between the am- 
plitudes of Fourier modes with precisely defined (delta func- 
tion) wavenumbers. If information is conserved by non-linear 
evolution then it is this quantity that, overall, remains in- 
variant with time. 

Theory predicts that the effects of beat-coupling will
be greatest when the largest scales in a survey are close
to the peak of the power spectrum ($k \sim 0.016\,h\,\mathrm{Mpc}^{-1}$;
$\pi/k \sim 200\,h^{-1}$ Mpc). If this is true then our results suggest
that the covariance of power in such a survey will be dominated
by beat-coupling on small scales and, counter-intuitively,
will be sensitive to the power on the largest scales in the
survey, leading to a reduction in the amount of information
extractable from the power spectrum. As pointed out
in HRS, the best way to test this is using mock galaxy
catalogues drawn from a single large simulation using the same
selection function as the survey observations.